On a Minimal Spanning Tree Approach in the Cluster Validation Problem
Abstract
In this paper, a method for studying cluster stability is proposed. We draw pairs of samples from the data according to two sampling distributions. The first distribution corresponds to the high-density zones of the distribution of data elements, and is therefore associated with the cluster cores. The second, associated with the cluster margins, corresponds to the low-density zones. The samples are clustered and the two resulting partitions are compared. The partitions are considered consistent if the clusters obtained are similar. Resemblance is measured by the total number of edges, in the clusters' minimal spanning trees, that connect points from different samples. We use the Friedman and Rafsky two-sample test statistic. Under the homogeneity hypothesis, this statistic is asymptotically normally distributed. It can therefore be expected that the true number of clusters corresponds to the empirical distribution of the statistic that is closest to normal. Numerical experiments demonstrate the ability of the approach to detect the true number of clusters.
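The edge count at the heart of the Friedman and Rafsky statistic can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes Euclidean distances, builds the minimal spanning tree of the pooled points with Prim's algorithm, and counts the edges that join points from different samples. The function names `mst_edges` and `cross_sample_edges` are hypothetical, and the standardization of the count (centering and scaling under the homogeneity hypothesis) is left out.

```python
import math


def mst_edges(points):
    """Return the n-1 edges (i, j) of a Euclidean minimal spanning tree.

    Uses Prim's algorithm on the complete graph over `points`
    (a list of equal-length coordinate tuples). O(n^2) time.
    """
    n = len(points)
    if n < 2:
        return []
    in_tree = [False] * n
    best_dist = [math.inf] * n   # cheapest known link from the tree to each point
    best_from = [-1] * n         # tree vertex that realizes that link
    in_tree[0] = True
    for j in range(1, n):
        best_dist[j] = math.dist(points[0], points[j])
        best_from[j] = 0
    edges = []
    for _ in range(n - 1):
        # Pick the point outside the tree that is closest to the tree.
        j = min((k for k in range(n) if not in_tree[k]),
                key=best_dist.__getitem__)
        in_tree[j] = True
        edges.append((best_from[j], j))
        # Relax the links of the remaining points through the new vertex.
        for k in range(n):
            if not in_tree[k]:
                d = math.dist(points[j], points[k])
                if d < best_dist[k]:
                    best_dist[k] = d
                    best_from[k] = j
    return edges


def cross_sample_edges(sample_a, sample_b):
    """Count MST edges of the pooled data that join points of different samples."""
    pooled = list(sample_a) + list(sample_b)
    labels = [0] * len(sample_a) + [1] * len(sample_b)
    return sum(1 for i, j in mst_edges(pooled) if labels[i] != labels[j])


# Two well-separated samples share a single connecting edge.
separated = cross_sample_edges([(0, 0), (0, 1), (1, 0)],
                               [(10, 10), (10, 11), (11, 10)])
# Two perfectly interleaved samples on a line: every edge is a cross edge.
interleaved = cross_sample_edges([(0,), (2,), (4,)], [(1,), (3,), (5,)])
print(separated, interleaved)  # prints: 1 5
```

A small count indicates that the two samples occupy different regions, while a count near its expected value under homogeneity indicates that the samples are well mixed. In the setting of the abstract, this count would be computed within each cluster and summed over the clusters.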
Related articles
SOLVING A STEP FIXED CHARGE TRANSPORTATION PROBLEM BY A SPANNING TREE-BASED MEMETIC ALGORITHM
In this paper, we consider the step fixed-charge transportation problem (FCTP), in which a step fixed cost, sometimes called a setup cost, is incurred if another related variable assumes a nonzero value. To solve this NP-hard problem, two metaheuristics are developed: a spanning tree-based genetic algorithm (GA) and a spanning tree-based memetic algorithm (MA). For compa...
A Metaheuristic Algorithm for the Minimum Routing Cost Spanning Tree Problem
The routing cost of a spanning tree in a weighted, connected graph is defined as the total length of the paths between all pairs of vertices. The objective of the minimum routing cost spanning tree problem is to find a spanning tree whose routing cost is minimum. This is an NP-hard problem, for which we present a GRASP with path-relinking metaheuristic algorithm. GRASP is a multi-start ...
LC Note: LC-TOOL-2004-020 arXiv:physics/0409039 CALORIMETER CLUSTERING WITH MINIMAL SPANNING TREES
We present a top-down approach to calorimeter clustering. An algorithm based on minimal spanning tree theory is described briefly.
An Optimal Iterative Minimal Spanning Tree Clustering Algorithm for Images
Limited spatial resolution, poor contrast, overlapping intensities, noise, and intensity inhomogeneities make the segmentation of medical images very difficult. Recently, automatic segmentation systems supported by mathematical algorithms have come to play an important role in the clustering of images. The minimal spanning tree algorithm is capable of detecting clustering with ...
OPTIMIZATION OF TREE-STRUCTURED GAS DISTRIBUTION NETWORK USING ANT COLONY OPTIMIZATION: A CASE STUDY
An Ant Colony Optimization (ACO) algorithm is proposed for the optimal design of a tree-structured natural gas distribution network. The design of pipelines, facilities, and equipment systems is a necessary task in configuring an optimal natural gas network. A mixed integer programming model is formulated to minimize the total cost in the network. The aim is to optimize pipe diameter sizes so that the location-alloc...
Journal: Informatica, Lith. Acad. Sci.
Volume: 20
Publication date: 2009